Abstract

This study investigated whether video-based materials can facilitate second language learners’ text 
comprehension at the levels of macrostructure and microstructure. Three classes comprising 98 
Chinese-speaking university students joined this study. The three classes were randomly assigned to three 
treatment groups: on-screen text (T Group), concurrent narration with on-screen text (NT Group), and video with 
concurrent narration and on-screen text (VNT Group). The data were collected through the macrostructure and 
microstructure reading comprehension pre- and post-tests and the immediate test. The statistical results of the 
immediate test and the post-tests showed that the VNT group performed significantly better on 
macrostructure comprehension than the T and NT groups. Drawing on the perspective of multiliteracies and these 
significant results, the study offers instructional recommendations for integrating video into second-language reading 
comprehension instruction. 

1. Introduction 

Advances in technology and the boom in handheld smart devices allow access to visual learning tools anywhere 
and at any time. It is now common for individuals to be immersed in visual surroundings through multiple handheld 
devices. To prepare students for a rapidly changing and globalized world, the New London Group (NLG) (1996) 
proposed a multiliteracies pedagogy that emphasizes the multiple dimensions of literacy and claimed that literacy 
teaching may go beyond written language 
and further teach students to interpret meanings of text through other modes, such as visual representation, audio 
representation, tactile representation, gestural representation, and spatial representation (Kress, 2003). Video, a 
kind of visual support, can be used to enhance second language (L2) learning. 

Multimedia instructional environments are widely recognized as having great potential for improving the way 
that people learn (Mayer, 2005; Plass & Jones, 2005). The term “multimedia environment” refers to settings in 
which information is presented in more than one format, such as in both a verbal and nonverbal form or in an 
audio and visual form (Mayer, 2005; Moore, Burton, & Myers, 2004). Multimedia material saturated with video 
plays an essential role in assisting L2 learning. Using video as a teaching tool, some researchers have found that 
it can foster L2 vocabulary learning, as well as listening and reading comprehension (Chun & Plass, 1996a, 
1996b; Jones & Plass, 2002), and the results of these studies further support Paivio’s dual coding theory (DCT) 
of cognition (2007). In defining DCT, Paivio (2007) explained that “cognition involves the cooperative activity 
of two functionally independent but interconnected systems, a nonverbal system specialized for dealing with 
nonlinguistic objects and events, and a verbal system specialized for dealing directly with language” (p. 33). In 
this way, video can promote L2 learning when used in ways that are consistent with the dual-code assumption. 

However, the information displayed through multimedia, such as video, audio, and captions, may result in the 
learner’s cognitive overload. Due to limited working memory capacity (Baddeley, 2007), the learner may not be 
able to simultaneously process the information in different modalities. The effect of input modality (video, audio, 
and captions) on L2 learning needs further investigation with respect to different levels of comprehension. 

The major purpose of this study is to examine whether video-based materials can foster L2 learners’ abilities to 
understand L2 text in terms of the macrostructure and microstructure levels of comprehension. The cognitive 
processes related to comprehension range across the entire spectrum of mental representations (Caccamise, 
Snyder, & Kintsch, 2008). Multiple levels of proposition representations may exist, ranging from the 
micropropositions expressed in the text itself to various levels of macropropositions constructed by the learner 
(Kintsch, 1998). The present study assumes that comprehension occurs on both macro- and micro-levels. The 
benefit of dividing comprehension into levels is that each level can be further subdivided into specific skills, and 
this approach can make comprehension instruction seem concrete and manageable (Robinson & McKenna, 
2008). Comprehension interpreted in terms of macrostructure and microstructure levels has not been fully 
considered in previous L2 multimedia research. With the advance of technology, video-based materials have 
been widely uploaded on the Internet and used in the L2 context. Further investigations of the effects of 
video-based materials are needed to address L2 learners’ text comprehension. 

2. Literature Review 

2.1 Dual-Code Processing 

As Plass and Jones (2005) indicated, an integration of text with pictorial cues fosters L2 learning. The positive 
effects of a combination of pictorial and verbal cues can be explained from a dual-code processing perspective 
(Paivio, 2007). Paivio’s DCT (2007) assumes that there are two subsystems processing incoming verbal and 
nonverbal stimuli. Humans’ mental structures are associative networks of verbal and nonverbal representations. 
The verbal representations include “visual, auditory, articulatory, and other modality-specific verbal codes” 
(Clark & Paivio, 1991, p. 151), while the nonverbal representations contain “modality-specific images for shapes, 
environmental sounds, actions, skeletal or visceral sensations related to emotion and other non-linguistic objects 
and events” (Clark & Paivio, 1991, p. 151). The assumption of DCT implies that the learner is capable of dealing 
with incoming verbal and image information simultaneously when viewing video materials. 

2.2 A Pedagogy of Multiliteracies 

Second language learners can be trained with a multiliteracies pedagogy to expand their text comprehension 
ability. The multiliteracies pedagogy developed in 1994 by the NLG aims to integrate a wide range of cultural, 
linguistic, communicative, and technological diversity into classroom teaching to help students to prepare 
themselves well for a rapidly changing, globalized world. The NLG (1996) claimed that “how negotiating the 
multiple linguistic and cultural differences in [the] society is central to the pragmatics of the working, civic, and 
private lives of students” (p. 60). A pedagogy of multiliteracies emphasizes the concept of multimodality, 
which covers all forms of representation: language learners design meaning by means of different modes, 
such as written and oral language, visual representation, audio representation, tactile representation, gestural 
representation, and spatial representation (Kress, 2003). Cope and Kalantzis (2009) indicated that many modes 
were encouraged to be used in different forms of expression and further assumed that “where writing is found, 
visual supports allow a simplified syntax for the writing itself, in the form, for instance, of a decreasing clausal 
complexity” (p. 15). Visual supports hence may facilitate text comprehension. 

People across the world may use different technological devices and communication channels to express 
themselves and communicate with others. To help learners participate successfully in such a globalized, changing 
world, language educators may need not only to teach linguistic knowledge of English but also to pay attention to 
cultural, communicative, and technological diversity. Judging from the perspectives of the multiliteracies 
pedagogy, English learners may comprehend English text successfully when the text is supported with visual 
images. The present study intended to integrate video, a kind of visual support, with verbal information and 
further examine the effects of video-based materials on the learners’ English text comprehension in the EFL 
context. 

2.3 Effects of Videos on Second Language Learning 

Language instructors and researchers have long been interested in using images and pictures to enhance L2 
learning (Al-Seghayer, 2001; Chun & Plass, 1996a, 1996b; Sydorenko, 2010). This is probably because the 
construction of “propositions and of mental models require[s] simultaneous availability of corresponding text 
information and picture information in working memory” (Schnotz, 2005, p. 61). In general, verbal information 
associated with pictorial images can be learned more successfully than verbal information without such 
additional support. 

Video-based materials have been examined and have shown significant positive results. For example, videos have 
been found to aid vocabulary learning (Chun & Plass, 1996a, 1996b; Jones & Plass, 2002; Lin, 2010; Sydorenko, 
2010), as well as listening and reading comprehension (Chapple & Curtis, 2000; Herron et al., 2006; Plass, 
Chun, Mayer, & Leutner, 1998; Winke, Gass, & Sydorenko, 2010). For example, using video advance 
organizers, Chun and Plass (1996a, 1996b) found that visually and verbally annotated words resulted in better 
general text comprehension than those with only verbal annotation and those without any annotation. With 
English-speaking college students enrolled in a German course as the participants, the results of Plass, Chun, 
Mayer, and Leutner’s (1998) study revealed that students remember word translations better in the condition of 
visual (i.e., picture or video) and verbal (i.e., text translation) annotations than in the conditions of written 
annotations alone, pictorial annotations alone, or with neither form of annotation. Examining the effects of the 
four listening treatments on the university students’ comprehension of the aural passage, Jones and Plass (2002) 
further found that the participants selecting both written and pictorial annotations outperformed those in the other 
three treatments (i.e. only written annotations, only pictorial annotations, or no annotation). Herron et al. (2006) 
conducted a study to compare a video instructional package and a text-based instructional package, and verified 
that intermediate-level college French students in the video-based course significantly improved not only their 
grammar knowledge but also their listening skills. Some studies have also shown that video may enhance not only 
L2 learners’ four language skills and grammar knowledge but also their critical thinking ability. For example, 
with 31 Cantonese tertiary-level students in Hong Kong as the participants, Chapple and Curtis (2000) reported that 
the participants had improved their English analytical and critical thinking. 

More recently, researchers confirmed that video-based materials could enhance second language learning. For 
example, Sydorenko (2010) examined the effects of video, audio, and captions on the learning of written and 
aural words. The results showed that the group receiving video, audio, and captions (VAC), and the group 
receiving video and captions (VC) scored higher than the group receiving video and audio (VA) on the written 
recognition of word forms. These results suggest that the VAC combination including verbal and nonverbal 
information is more suitable than the VA combination including verbal information for learning the meanings of 
new words. For a specific examination on L2 vocabulary acquisition, Lin (2010) designed a video-based CALL 
program for a Taiwanese university. The results revealed that video-based materials significantly improved 
less-proficient participants’ incidental vocabulary acquisition and text comprehension; both proficient and 
less-proficient participants’ vocabulary acquisition was positively related to their video comprehension (Lin, 
2010). In general, the studies on the effects of pictorial cues provide evidence for Paivio’s DCT (2007), which assumes 
that memory for verbal information can be improved if a corresponding visual is simultaneously presented or if 
the learner can imagine a visual image to go with the verbal information. 

However, another path of research draws attention to whether adding videos to the material can hinder L2 
learning due to the learner’s working memory (WM) capacity. WM acts as “the temporary storage of information 
that is being processed in any of a range of cognitive tasks” (Baddeley, 1986, p. 34). However, WM is limited in 
its capacity to store and maintain input information and may not successfully handle different input modalities of 
information simultaneously, and therefore the learner may experience cognitive overload and fail to comprehend 
the text successfully. The contradiction between the learner’s cognitive load caused by the presentation format 
and the learner’s limited WM capacity can be a considerable challenge for multimedia learning. To investigate 
the manifestations of this challenge, the present study intended to compare the effects of different presentation 
conditions which differed in verbal and nonverbal input modalities. 

2.4 Macrostructure and Microstructure Comprehension 

Cognitive science researchers have attempted to describe the various processes involved in reading 
comprehension. In fact, comprehension is a continuous process, demanding control of mental connections from 
the learner’s memory representations. To study reading comprehension empirically and to 
implement instruction concretely and manageably, some researchers have categorized comprehension as 
dual-level. This dual-level concept of comprehension has been labeled in a variety of ways. For example, 
viewing reading as a paradigm of cognition processes, Chun and Plass (1997) regarded comprehension as the 
interaction of bottom-up processes (vocabulary acquisition) with top-down processes (activating prior 
knowledge). Bottom-up approaches assume that “the written text is hierarchically organized on the letters, words 
and word groups, and that the reader first processes the smallest linguistic units, gradually compiling the smaller 
units to decipher and comprehend the higher unit” (Lin, 2004, p. 31). On the other hand, top-down approaches 
assume that “reading begins with knowledge and hypotheses in the mind of the reader” (Lin, 2004, p. 33); the 
reading comprehension process is driven by concepts. Other paired terms used in previous studies include 
general versus local (Block, 1986), deep versus shallow (Graesser, 2008), higher versus lower (Pressley, 2000), 
and macrostructure versus microstructure (Kintsch, 1998). These dichotomies refer to the same range of 
information processes, from local meaning operations in the text itself, such as decoding word meanings and 
identifying syntactic relations, to overall meaning operations, such as establishing a coherent memory 
representation of the text, or arriving at an overall impression of the content of the text. 

Following the work of van Dijk and Kintsch (1983) and Kintsch (1998), the present study focused on comparing 
the learner’s ability to understand linguistic information in the text from smaller to larger linguistic units, such as 
words, phrases, and sentences up to the whole text. The assumption that comprehension is processed at micro- 
and macro-levels may be attributed to text organization which contains individual local and global structures that 
yield two types of mental representation structures: micro- and macro-propositions (Kintsch, 1998). According to 
Kintsch (1998), text mental propositions are generated from words and phrases in the text itself, and are 
connected together. Micropropositions refer to the interconnection of some propositions with their previous and 
subsequent propositions, and the formulation of local meaning relationships (or the microstructure connections). 
For example, the individual propositions expressed by the words, phrases, and sentences of a text can be 
regarded as micropropositions (Kintsch, 1998). Regarding reading comprehension, the process in which readers 
rely on micropropositions to understand the text can be regarded as microstructure (MICS) comprehension. Kintsch contrasted 
microstructure connections with macrostructure connections. Macropropositions refer to the interaction of some 
propositions with higher-level concepts in the activated knowledge net, and the revealing of more global 
relationships in the text. A summary of the text is one type of macrostructure connection. When readers 
make a macrostructure connection during the process of reading, this type of comprehension can be regarded as 
macrostructure (MACS) comprehension. Both MACS and MICS comprehension play an essential role in reading 
comprehension. 

From the reviewed literature related to reading comprehension and multimedia instruction, some gaps exist in 
the investigation of MACS and MICS comprehension. First, the text comprehension model developed by 
Kintsch and van Dijk has been well investigated in recent years as it relates to first language (L1) reading 
comprehension (Kintsch & Yarbrough, 1982), and it is currently one of the most widely accepted scientific 
models of text comprehension in the literature (Nassaji, 2006). However, it appears that its full potential 
application in L2 text comprehension in multimedia environments has not yet been explored. The second gap is 
that the aforementioned video research examined general comprehension rather than the macrostructure and 
microstructure of comprehension. The present study therefore investigated the effects of video on L2 learners’ 
MACS and MICS comprehension. 

The third gap is the study of video-based materials in the EFL context. That is, the need exists to use authentic 
videos as the research material. Authentic videos are video material produced for native speakers of a language. 
Reviewed literature verified that videos are useful for visualizing processes, and can clarify complex ideas and 
make them easier to remember; moreover, videos can enhance understanding of those concepts that are difficult 
to explain verbally (Ciccone, 1995; White, Easton, & Anderson, 2000). In most EFL classrooms, there are not 
many native English speakers. In such situations, video seems to be an ideal medium for introducing L2 learners 
to authentic input. Plass and Jones (2005) pointed out that one of the limitations of existing studies on second 
language acquisition with multimedia is the limited number of empirical studies that use authentic videos from 
the target culture to evaluate L2 learning outcomes. This study was therefore conducted to examine how 
authentic videos can support or complement reading comprehension in the EFL context. 

2.5 Research Questions 

The results of this study may help us understand whether video-based materials can effectively facilitate L2 
learners’ ability to comprehend written text. If so, the present study will go on to examine which level of 
comprehension can be fostered through the use of video-based materials. To achieve the aims of this study, the 
following research question is addressed: 

Among the three presentation conditions 1) pure text, 2) narration and text, and 3) video, narration, and text, 
which presentation condition significantly improves L2 learners’ MACS and MICS comprehension in terms of 
the immediate test and the reading comprehension post-tests? 

3. Method 

3.1 Research Design 

This study used a pre- and post-test research design. The treatment was conducted in five sessions over five 
weeks. Three presentation conditions were designed to test the differences: (1) the group instructed with 
on-screen pure text (T Group), (2) the group instructed with concurrent narration with on-screen text (NT Group), 
and (3) the group instructed with video accompanied with concurrent narration and on-screen text (VNT Group). 
During the treatment, an immediate test was conducted right after each session. Reading comprehension (RC) 
pre- and post-tests were conducted before and after the treatment. 

3.2 Participants 

The pool of participants was a total of 98 undergraduate university students recruited from three English courses. 
Their ages ranged from 20 to 22, and their native language was Mandarin. The three classes were instructed by 
the same teacher. The number of participants in each group was as follows: the T group, n = 30; the NT group, n 
= 32; the VNT group, n = 36. 

RC MACS and MICS pretests were administered, and the scores were submitted to a one-way ANOVA to 
test for possible differences among groups. Regarding MACS comprehension, the mean scores were 66.83 (SD = 
9.24), 65.94 (SD = 9.45), and 67.22 (SD = 11.68) for the T group, NT group, and VNT group, respectively. 
Regarding MICS comprehension, the mean scores were 64.67 (SD = 14.32), 65.93 (SD = 13.65), and 65.56 (SD 
= 14.63) for the T group, NT group, and VNT group, respectively. The results revealed no significant differences 
across the three groups in either MACS comprehension or MICS comprehension. These results suggest that all 
three groups were statistically similar in their MACS and MICS comprehension abilities before receiving the 
treatments. 

3.3 Instruments 

3.3.1 Reading Comprehension (RC) Pre- and Post-tests 

The pre- and post-tests were used to investigate whether the participants made progress in their ability to 
understand written text at the MACS and MICS levels and also compare whether there were significant 
differences among the three conditions. Each test contained five passages, which were unrelated to the contents 
of the five video-based materials in the treatment. The passages were directly adopted from the introductory-level 
reading comprehension textbook by Pauk (2000). Each passage contained six question items. To 
answer the research question, the author further divided the six items into MACS and MICS categories. The 
MACS comprehension questions required the students to get the gist of the text and provide responses that were 
not stated directly in the assigned material. In contrast, the MICS comprehension questions required the students 
to read phrase by phrase or sentence by sentence and provide answers that were clearly stated somewhere in the 
target material. Each passage contained four MACS questions and two MICS questions. The four MACS 
comprehension strategies were coded as follows: identifying main ideas (MI), synthesizing the subject matter of 
a passage (SM), drawing conclusions from a passage (CON), and identifying the writer’s devices (DEV). The 
two MICS comprehension strategies included searching for supporting details (SD) and decoding the meaning of 
vocabulary (VOC). 

The pre- and post-tests each yielded separate scores for MACS and MICS comprehension. Participants 
received one point for each correct answer; each passage had a possible total of four points for MACS 
comprehension and two points for MICS comprehension. The maximum scores of MACS comprehension and 
MICS comprehension in the pre-test were thus twenty and ten, respectively. The maximum scores of MACS 
comprehension and MICS comprehension in the post-test were the same. For each group, the participants’ 
correct responses on both the MACS and MICS comprehension items were summed and converted into their 
respective correct percentage scores. Cronbach’s alpha inter-item reliability estimates for the pre- and post-tests 
were .81 and .83 respectively. 
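
As an illustration of the scoring rule described above, the following minimal Python sketch converts raw counts of correct answers into percentage scores. The function name and the sample counts (14 and 7) are hypothetical and are not taken from the study’s data; only the number of passages and items per passage come from the description above.

```python
def percent_correct(num_correct: int, max_points: int) -> float:
    """Convert a raw count of correct answers into a percentage score."""
    if not 0 <= num_correct <= max_points:
        raise ValueError("num_correct must lie between 0 and max_points")
    return 100.0 * num_correct / max_points


# Each pre-/post-test has 5 passages with 4 MACS items and 2 MICS items each.
PASSAGES = 5
MACS_MAX = PASSAGES * 4  # 20 points
MICS_MAX = PASSAGES * 2  # 10 points

# Hypothetical participant (illustrative numbers, not study data).
macs_pct = percent_correct(14, MACS_MAX)
mics_pct = percent_correct(7, MICS_MAX)
print(macs_pct, mics_pct)
```

Expressing both subscores as percentages places MACS and MICS results on the same 0 to 100 scale even though the two categories contain different numbers of items.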

3.3.2 The Immediate RC Tests 

The immediate test focused on testing the participants’ ability to understand the assigned video immediately after 
viewing it. After viewing one video, the participants in the three groups took an immediate test. Because the 
treatment included five sessions of video viewing, five immediate tests were constructed. The tests also included 
MACS and MICS comprehension questions. The format of the immediate comprehension tests was 
multiple-choice questions with four options. The items were designed by the author by following Pauk’s 
categories (2000). An immediate test included ten question items: three MACS comprehension questions (that is, 
an SM question, a CON question, and a DEV question) and seven MICS comprehension questions (four SD 
questions and three VOC questions). Cronbach’s alpha inter-item reliability estimates for the five immediate 
tests ranged from .75 to .81. 
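
For readers unfamiliar with the reliability estimate reported here, Cronbach’s alpha can be computed from a participants-by-items score matrix as alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is a generic illustration of that formula and is not the procedure used by the author; the function name and the small 0/1 score matrix are invented for demonstration only.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per participant)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented toy data: 5 participants x 4 dichotomous (0/1) items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))
```

Sample variances (n - 1 in the denominator) are used for both the items and the totals, which is the usual convention for this estimate.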

The immediate test included the scores of MACS and MICS comprehension. Participants received one point for 
each correct answer; the maximum score for each immediate test was ten. In each immediate test, a total of three 
points was possible for the MACS comprehension questions and seven points for the MICS comprehension 
questions. For the five tests, the maximum scores of MACS comprehension and MICS comprehension were 15 
points and 35 points, respectively. See Appendix for a sample of questions designed for a video. For each group, 
the participants’ totaled correct responses on the MACS and MICS comprehension items were converted into the 
respective percentage scores. 

3.4 Treatment Materials 

The video materials were selected from an online learning program developed by LiveABC Interactive 
Corporation (2007), a language learning magazine company in Taiwan (see the detailed introduction in Lin’s 
(2011, 2014) studies). The videos were television news programs of the Cable News Network (CNN) featuring 
live scenes, referents, and interaction between the journalist and the interviewee. The titles and the sequence of 
the video materials used in the five sessions were as follows: (1) Grapes in the Gas Tank, (2) Survival of the 
Cutest, (3) Rules of Devotion, (4) Life Everlasting, and (5) Robo Surgeons. 

In this study, three types of treatment material were designed for the three groups, each of which was displayed 
on the students’ individual computers. The T group read only on-screen text. The NT group read on-screen text 
and listened to narrations simultaneously. The on-screen text used in the T and NT conditions was presented in the 
middle of the screen, while the text in the VNT condition was presented at the bottom of the screen. The on-screen 
text for the T and NT groups included an indication of who was speaking at each point; different speakers’ lines 
were presented in different colors. The VNT group read on-screen text, listened to aural narrations, and 
additionally viewed video images appearing in the middle of the screen. 

3.5 Data Collection Procedure 

The full procedure of the study was completed over three months. Three groups of participants were randomly 
assigned to the three presentation conditions: the T group, the NT group, and the VNT group. First, the three groups 
of participants completed an RC pretest. The instructor led the participants to practice using the equipment in the 
language lab and gave each group instructions. Second, the five-week treatment began. The 
treatment procedure consisted of five consecutive weekly sessions of 100 minutes each. During the treatment 
period, the participants read, viewed, or listened to a total of five treatment materials at the rate of one per week. 
In each session, the instructor started by reading aloud the title of the assigned material and its Chinese 
equivalent. Afterwards, the participants watched/read/listened to the treatment material as many times as they 
wanted. They received no grammar or vocabulary instruction from the instructor, but they were permitted to use 
online dictionaries to look for the meanings of unfamiliar vocabulary. As soon as they finished practicing with 
the treatment material, they completed an immediate test. Third, two weeks after the treatment, all participants 
completed an RC post-test. 

4. Results 

In total, 98 participants of the three groups completed the RC pre- and post-tests and the immediate test. The 
collected data were analyzed from the quantitative standpoint to address the research question. Descriptive 
statistics (means and standard deviations) were computed first, and ANOVA tests were further conducted to 
examine the differences between the three groups. Mean scores are correct percentage scores. Mean differences 
in the dependent measures were tested for significance at the 0.05 level. 

4.1 The Immediate Test of MACS and MICS Comprehension 

To examine presentation condition differences in MACS and MICS comprehension, a one-way ANOVA was 
conducted on the immediate test. Table 1 displays the mean scores, and the SDs on the immediate MACS and 
MICS comprehension tests. Regarding MACS comprehension ability, the one-way ANOVA results showed a 
significant main effect for Group, F(2, 95) = 4.85, p = .01. A post hoc Scheffé test revealed that there were 
significant differences between the VNT and T groups, and between the VNT and NT groups. However, there 
was no significant difference between the T and NT groups. From comparisons of the mean scores, the VNT 
group achieved a significantly higher MACS comprehension score (M = 59.03) than the T group (M = 51.83), 
r = .33, p = .037, and the NT group (M = 51.72), r = .32, p = .029. 
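
The F ratio of a one-way ANOVA can be recomputed from group sizes, means, and standard deviations alone. The sketch below does this for the immediate MACS test using the values listed in Table 1. The function name is hypothetical, and the original analysis was presumably run on raw scores in a statistics package, so small rounding differences are to be expected.

```python
def anova_from_summary(groups):
    """One-way ANOVA F ratio from (n, mean, sd) tuples, one tuple per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares, recovered from each sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Immediate MACS test, Table 1: (n, mean, SD) for the T, NT, and VNT groups.
macs_immediate = [(30, 51.83, 9.87), (32, 51.72, 12.42), (36, 59.03, 10.88)]
f_value, df1, df2 = anova_from_summary(macs_immediate)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With the rounded summary values, the sketch yields an F(2, 95) of about 4.85, which agrees with the reported statistic.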

Regarding MICS comprehension ability, the one-way ANOVA results also showed a significant main effect for 
Group, F(2, 95) = 7.19, p = .001. A post hoc Scheffé test also revealed that there were significant differences 
between the VNT and T groups, and between the VNT and NT groups. There was no significant difference 
between the T and NT groups. From comparisons of the mean scores, the VNT group achieved a significantly 
higher MICS comprehension score (M = 55.28) than the T group (M = 48.45), r = .36, p = .009, and the NT 
group (M = 48.13), r = .38, p = .005. The results summarized above suggest that different presentation conditions 
appeared to have differential effects on L2 learners’ MACS and MICS comprehension abilities right after they 
received the different treatment materials. The findings suggest that participants exposed to video-based 
materials have better MACS and MICS comprehension of the content than their counterparts receiving treatment 
without video. 

Table 1. Descriptive statistics for the immediate MACS and MICS tests 

Dependent variable   Group   n     Mean    SD      Std. Error
MACS                 T       30    51.83   9.87    1.80
                     NT      32    51.72   12.42   2.20
                     VNT     36    59.03   10.88   1.81
MICS                 T       30    48.45   8.87    1.62
                     NT      32    48.13   8.54    1.51
                     VNT     36    55.28   8.99    1.50

4.2 RC MACS Pre- and Post-Tests 

Presentation condition differences in MACS comprehension of the passages were first examined. A repeated 
measures ANOVA was conducted on the RC MACS pre- and post-test scores. Table 2 presents the mean scores 
and the SDs on the RC MACS comprehension pre- and post-tests. In Table 3, the results of a repeated measures 
ANOVA on the RC MACS pre- and post-tests [Test (Pre and Post) x Group (T, NT, and VNT)] show significant 
main effects for Group, Test, and the interaction of both. Figure 1 visually displays these results. Furthermore, 
the one-way ANOVA of the RC MACS post-test reveals a significant main effect of Group, F(2, 95) = 
11.07, p < .001. A post hoc Scheffé test shows that the main effect of Group was due to significant differences 
between the VNT group (M = 77.78) and the T group (M = 65.50), r = .53, p < .001, and between the VNT group 
(M = 77.78) and the NT group (M = 70.46), r = .31, p = .022. However, there was no significant difference 
between the T and NT groups. The VNT group of participants outperformed the other two groups in MACS 
comprehension when they read the passages in the post-test. 
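
The text does not state how the effect sizes r were obtained. One common route, assumed here rather than confirmed by the source, is to compute Cohen’s d from the two group means, using the root mean square of the two standard deviations as the denominator, and then to convert d to r with r = d / sqrt(d^2 + 4), a conversion that presupposes roughly equal group sizes. The sketch applies this to the MACS post-test means and standard deviations reported in Table 2; the function names are hypothetical.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using the root mean square of the two SDs (assumed convention)."""
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


def d_to_r(d):
    """Convert Cohen's d to r; the constant 4 assumes equal group sizes."""
    return d / sqrt(d ** 2 + 4)


# MACS post-test values from Table 2: (mean, SD) per group.
vnt = (77.78, 10.03)
t_group = (65.50, 9.68)
nt_group = (70.46, 12.01)

r_vnt_t = d_to_r(cohens_d(*vnt, *t_group))
r_vnt_nt = d_to_r(cohens_d(*vnt, *nt_group))
print(round(r_vnt_t, 2), round(r_vnt_nt, 2))
```

Under these assumptions the sketch gives r of about .53 for VNT versus T and about .31 for VNT versus NT, in line with the values reported above. This agreement is suggestive only and does not establish that the author used this formula.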


Table 2. Descriptive statistics for the RC MACS and MICS pre- and post-tests 

                                  Pretest                      Posttest
Dependent variable   Group   N    Mean    SD      Std. Error   Mean    SD      Std. Error
MACS                 T       30   66.83   9.24    1.69         65.50   9.68    1.77
                     NT      32   65.94   9.46    1.67         70.46   12.01   2.16
                     VNT     36   67.22   11.68   1.95         77.78   10.03   1.67
MICS                 T       30   64.67   14.32   2.61         70.00   12.59   2.30
                     NT      32   65.94   13.65   2.41         75.63   15.64   2.77
                     VNT     36   65.56   14.63   2.44         75.56   13.62   2.27

Moreover, paired t-tests were conducted to examine whether each of the three groups improved significantly 
from the pre-test to the post-test. Among the three groups, only the VNT group showed a significant 
difference between its MACS pre- and post-tests (VNT: t = -8.10, p = .000; NT: t = -1.85, p = .07; T: t = .614, 
p = .54). That is, the VNT group achieved a significantly higher score on the MACS post-test (M = 77.78) than 
on the pretest (M = 67.22), r = -0.44. The finding suggests that participants in the VNT group made a 
significant improvement in their MACS comprehension ability. 
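
The text does not state how the effect size r was obtained. One common route, shown here purely as an assumption, is to compute Cohen's d from the two means and the simple average of the two variances, and then convert it with r = d / sqrt(d^2 + 4). Applied to the VNT group's MACS pretest and post-test statistics in Table 2, this conversion gives a value close to the reported r = -0.44. The function names `cohens_d` and `d_to_r` are hypothetical, and the approach treats the two sets of scores as if they were independent, so it is a sketch for checking magnitudes rather than a definitive account of the analysis.

```python
import math


def cohens_d(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Cohen's d using the root mean square of the two SDs as the pooled SD
    (an assumed pooling choice; other definitions exist)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def d_to_r(d: float) -> float:
    """Convert Cohen's d to an effect size r (equal-group-size approximation)."""
    return d / math.sqrt(d ** 2 + 4)


if __name__ == "__main__":
    # VNT group, MACS pretest vs. post-test (means and SDs from Table 2)
    d = cohens_d(67.22, 11.68, 77.78, 10.03)
    print(f"d = {d:.2f}, r = {d_to_r(d):.2f}")
```

The negative sign simply reflects that the pretest mean is subtracted from a larger post-test mean in this ordering.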


Table 3. The results of repeated-measures ANOVA on the RC MACS pre- and post-tests 

Source                       SS        df   MS        F       p
Group (T, NT, and VNT)       1399.40   2    699.70    4.86    .010*
Test (Pre and Post)          1024.01   1    1024.01   13.51   .000*
Interaction (Group x Test)   1160.61   2    580.305   7.66    .001*

Note. *p < .05. 

The results summarized above suggest that different presentation conditions appear to have differential effects on 
L2 learners’ MACS comprehension abilities. Furthermore, they also show a differential effect of Test on VNT 
participants’ MACS comprehension ability, which was fostered after they received the video-based treatment. 
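
The entries in Table 3 can be checked for internal consistency using two standard identities: MS = SS / df, and F = MS_effect / MS_error. Table 3 does not list the error terms, so the sketch below backs them out from the reported F values; the names `TABLE_3`, `mean_square`, and `implied_error_ms` are hypothetical. In a mixed design, the Test and Group x Test effects are normally tested against the same within-subjects error term, so their implied error mean squares should roughly agree, while Group is tested against a separate between-subjects error term.

```python
# (source, SS, df, reported MS, reported F), copied from Table 3
TABLE_3 = [
    ("Group", 1399.40, 2, 699.70, 4.86),
    ("Test", 1024.01, 1, 1024.01, 13.51),
    ("Group x Test", 1160.61, 2, 580.305, 7.66),
]


def mean_square(ss: float, df: int) -> float:
    """Mean square is the sum of squares divided by its degrees of freedom."""
    return ss / df


def implied_error_ms(ms_effect: float, f_ratio: float) -> float:
    """Rearrange F = MS_effect / MS_error to recover the unreported error term."""
    return ms_effect / f_ratio


if __name__ == "__main__":
    for source, ss, df, ms, f_ratio in TABLE_3:
        print(
            f"{source:13s} MS check: {mean_square(ss, df):9.3f} (reported {ms}) | "
            f"implied error MS: {implied_error_ms(ms, f_ratio):7.2f}"
        )
```

With the reported values, the Test and interaction rows imply nearly the same error mean square (about 75.8), which is what a shared within-subjects error term would produce.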

4.3 RC MICS Pre- and Post-Tests 

To examine the presentation condition differences in MICS comprehension, a repeated measures ANOVA was 
conducted on the RC MICS pre- and post-test scores. Table 2 presents the mean scores and the SDs of the RC 
MICS comprehension pre- and post-tests. As shown in Table 4, the results of a repeated measures ANOVA on the 
RC MICS pre- and post-tests [Test (Pre, Post) x Group (T, NT, VNT)] show a significant main effect for Test and 
non-significant effects for Group and the interaction of both. Figure 2 visually presents these results. A 
one-way ANOVA on the RC MICS post-test indicates that there were no significant differences in the 
MICS comprehension ability of the three groups. 


Table 4. The results of repeated-measures ANOVA on the RC MICS pre- and post-tests 

Source                       SS        df   MS        F       p
Group (T, NT, and VNT)       462.95    2    231.48    .922    .401
Test (Pre and Post)          3389.10   1    3389.10   22.96   .000*
Interaction (Group x Test)   213.433   2    106.72    .72     .488

Note. *p < .05. 



Figure 2. The three groups’ RC MICS pre- and post-tests 


To further examine the main effect for Test, a series of paired t-tests were performed so that possible significant 
improvement across the treatments could be measured. The results of the t-tests show significant improvements 
from the pre- to post-tests for the VNT and NT groups (VNT: t = -5.45, p = .000; NT: t = -2.98, p = .006; T: t = 
-1.66, p = .107). That is, the VNT group achieved a significantly higher score on the MICS post-test (M = 75.56) 
than on the pretest (M = 65.56), r = -0.34; the NT group also achieved a significantly higher score on the MICS 
post-test (M = 75.63) than on the pretest (M = 65.93), r = -0.31. Judging from the descriptive statistics in Table 2, 
the participants in the VNT and NT groups made significant progress in their MICS comprehension ability. 

The results suggest that exposure to different presentation conditions does not appear to have a differential effect 
on L2 learners’ MICS comprehension. The results showed, however, a differential effect of Test on the 
participants’ MICS ability, which was facilitated after a period of five weeks for the VNT and NT groups, but not 
for the T group. 

5. Discussion 

This study examined whether video-based material could foster L2 learners’ MACS and MICS comprehension 
abilities. The present study demonstrates the positive impact of video-based materials. Compared to the materials 
presented with concurrent narrations and on-screen text, or text alone, the material supplemented with video and 
narrations facilitated L2 learners’ comprehension at both the micro- and macro-levels on the immediate test. 
The results fill a gap in earlier multimedia reading comprehension studies, which focused only on learners’ 
general text comprehension ability when examining the effectiveness of different multimedia conditions. In 
general, the statistical results of the immediate tests suggest a positive effect of video on fostering L2 
learners’ MACS and MICS comprehension abilities. This result supports Jones and Plass’ (2002) assumption 
that pictorial information provided in addition to text may “help support micro- and macro-level processing 
in L2 computer-based reading activities” (p. 548). It also confirms the earlier research result that videos can 
be “a powerful addition to second language acquisition” (Lin, 2014, p. 24). The positive effects of video may 
be attributed to DCT (Paivio, 2007). Consistent with earlier research, these results reveal the positive effects 
of video on L2 learners’ reading comprehension and are in line with the DCT assumption that adding pictures 
to text tends to improve text comprehension: students in the VNT group had access to verbal codes (subtitles, 
narrations) as well as nonverbal codes (images). In the VNT presentation condition, the material was presented 
visually and textually, so the participants seemed to be able to construct verbal and nonverbal mental 
representations simultaneously in their verbal and nonverbal working memory systems. With the interconnected 
verbal and nonverbal systems, the two representations in the participants’ minds seemed to be able to “be 
accessed, compared, and used for whatever purpose is relevant in a given situation” (Paivio, 2007, p. 32). 
After being instructed with video-based materials, the participants likely could manipulate verbal and 
nonverbal information. 

It should be noted that the VNT group outperformed the other two groups on the immediate MICS test. A 
possible reason is that the video (i.e., visual images) presented concrete images of referents, so the learners 
could build mental connections between corresponding words and images or between supporting details and 
images. In this way, the VNT participants probably remembered the words and details and had more successful 
MICS comprehension than the T and NT participants. According to Kintsch’s (1998) CI model, this may well be 
because the participants who received input presented in the form of text, sound, and video likely constructed 
various mental representations, including visual, acoustic, and semantic codes. These representations of the same 
concept were repeatedly integrated with the participants’ prior knowledge, and hence probably reinforced their 
impression of the detailed information in the text. MICS comprehension included the acquisition of vocabulary. 
Lin (2010) indicated that video-based materials had positive effects on L2 vocabulary acquisition and explained 
that this was probably because video lent itself to sufficient visual portrayals of verbal information. Similarly, the 
participants in the present study likely learned vocabulary through both aural and visual sensory channels and 
also had the opportunity to connect the target vocabulary with concrete visual images through the video. As a 
result, the VNT participants were likely able to memorize the target words immediately and gained high scores 
on the immediate MICS tests. 

Based on the significant results of the RC MACS pre- and post-test, it can be noted that after the treatment, the 
participants in the VNT group changed their comprehension pattern, tending to comprehend the text at a 
macro-level. In this study, the finding suggests that video plays an essential role in fostering L2 learners’ MACS 
comprehension ability. This may be attributed to the logogen concept in DCT. DCT may be extended to explain 
the macro-structure of mental representations. In addition to smaller language units such as phonemes and words, 
DCT indicates the importance of imagery and concreteness for the comprehension of larger textual units, such as 
sentences, paragraphs, and whole texts. Paivio (2007) adopted the term logogen and regarded it as a variant of 
the widely used concept of lexical representation and assumed that logogen reflected “the internal organization 
and variable size of language units as perceived and produced” (p. 37). More specifically, the DCT logogen is 
assumed to build up in a hierarchical structure in which “larger units are composed of different combinations of 
smaller units” (Paivio, 2007, p. 37). Whenever things are remembered as a chunk, a representational unit beyond 
the word level is thus constructed (Paivio, 2007). In this study, it is possible for the VNT group to construct a 
representational unit beyond the word level and “to include stock phrases, idioms, and sequences as long as 
memorized poems, plays, bibles, and oral histories” (p. 38). With the support of images, large language units can 
be remembered effectively. The findings of the current study offer evidence for DCT and suggest that L2 learners 
are able to successfully handle incoming verbal and nonverbal information simultaneously. Paivio (2007) 
suggested that this instructional technique be designed to teach learners “how to concretize text using imagery 
and dual coding as they read” (p. 446). 

elt.ccsenet.org    English Language Teaching    Vol. 9, No. 10; 2016

Take the VNT presentation condition as an example. In the video-based presentation condition, video 
accompanied by verbal information was presented in a dynamic, rapid way. In just a brief moment, the video 
conveyed a great amount of meaning. The same meaning could be equally produced through a series of words. 
In such a situation, it appears that the VNT participants could not completely rely on the words to interpret the 
meaning of the text. Instead, they needed to synthesize the information across sentences. In this way, the VNT 
group made significant progress in constructing macrostructure connections in the text. The finding of this study 
suggests that verbal material integrated with video leads to a greater scope of text comprehension than when the 
material is processed through words alone. The results may also provide support for the multiliteracies pedagogy 
(NLG, 1996). In the VNT condition, video likely offered a simplified syntax for the narration and the caption, 
and decreased the clausal complexity of the written and oral text. The finding suggests that these multiple modes 
of representation are neither discrete nor mutually exclusive; rather, they are supplementary. 

In summary, the significant results of the VNT group lend some support to the notion of video-based L2 
comprehension instruction. The VNT presentation modality fostered a level of comprehension consistent with 
Kintsch’s (1998) text comprehension model as described by Caccamise et al. (2008): “if reading is 
unproblematic, what readers mainly remember is the gist of the text, that is, the main ideas, topics, and theme of 
an expository text or the plot of a story” (p. 84). Kintsch (1998) indicated that processing at this level requires 
logical thought, formal argument, deduction, and quantification not tied directly to the environment (i.e., the 
words or sentences in the text). Moreover, MACS comprehension constructs “a multidimensional meaning 
representation that may include visual, spatial, temporal, and emotional aspects, as well as abstractions implied 
by the text” (Caccamise et al., 2008, p. 84). In this study, video-based materials delivered the information in 
video, audio, and caption modalities. It appears that the participants integrated these sources to construct a 
macro-structure representation of the text. The results of the present study contribute to facilitating L2 text 
comprehension through the use of video-based materials. 

6. Conclusion 

In general, video-based material can be regarded as a valuable addition to L2 reading comprehension instruction. 
With this kind of material, language learners may immediately comprehend the content of a text well at both the 
micro and macro levels. Specifically, instruction that integrates video-based material gradually builds up 
language learners’ MACS comprehension ability, helping them distinguish between the main idea and the details 
of the text, pay attention to the theme of the text, identify the devices the writer has used, and draw or predict 
a conclusion from the text. Armed with the perspectives of multiliteracies and the significant results, the 
current study makes the following instructional recommendations to integrate video in L2 reading 
comprehension instruction. 

First, comprehension should be taught through a curriculum of multiliteracies. That is, the learners can be 
encouraged to interpret the meanings of text through different modalities, such as visualizing and verbalizing. 
Cope and Kalantzis (2000) indicated that L2 learners’ meaning-making resources may be found in 
representational objects, “patterned in [various] familiar and ... recognisable ways” (p. 10) and further 
emphasized that students can become “fully meaning-makers and remakers of signs and transformers of meaning” 
(p. 10). When learners read a passage, the teacher may ask them to connect text segments (i.e., words, phrases, 
sentences, and texts) with images by drawing a picture or describing their images for text segments. When using 
video, learners can view the video for a couple of seconds, recall the content of the video by looking at the 
scene where the clip is stopped, and then compare what they recall with the original script. In addition, after 
viewing the video, the teacher may ask learners to describe the content of the clip and illustrate it by showing 
a couple of scenes chosen from the clip for their comprehensibility. To help learners comprehend video 
effectively, instructors may consult two teaching resources: what methods instructors use to effectively teach a 
second language through videos (Sherman, 2003) and what strategies learners use to understand video (Lin, 2011). 

Second, MACS comprehension deserves attention. Higher order comprehension can be regarded as an important 
part of successful comprehension (Pressley, 2000). Higher order comprehension, which involves the cognitive 
processes of prediction, inference, and evaluation, deals with synthesizing the broad scope of information in the 
text. As reviewed in this study, the task of MACS comprehension is closely related to that of higher order 
comprehension and thus plays an essential role in text comprehension. Van Dijk and Kintsch (1983) analyzed L1 
readers’ discourse strategies and indicated that there existed overwhelming evidence of a significant role played 
by macrostructure strategies in L1 text comprehension. In the case of L2 text comprehension, the instructor should 
pay attention to this aspect and foster the learners’ MACS comprehension ability. At the end of this study, VNT 
participants showed better MACS comprehension ability than their counterparts in the T and NT groups. Given 
the importance of macrostructure comprehension in L1 learning and the facilitating effect of video-based 
multimedia on L2 learners’ MACS comprehension, multimedia materials integrated with on-screen text, 
narrations, and video are recommended for L2 comprehension instruction. 

At present, these research-based conclusions will initiate the next phase in research on the use of other types of 
multimedia materials in L2 learning contexts to further confirm the effects of pictorial images. However, there 
are some limitations. First, these results are intended to reflect group tendencies and do not apply with equal 
consistency to every participant. In the future, self-reports or one-on-one interviews may be added to explore 
individual perspectives. Second, the present study adopted Pauk’s (2000) model to examine six types of reading 
comprehension strategies. However, the exact strategies that contribute to MACS and MICS comprehension 
could be greater in number and could not be disentangled in this study. Future studies are needed to analyze 
other MACS and MICS comprehension strategies further. Third, the test instruments used in this study were in 
written form rather than aural form. In order to reach a complete understanding of the effect of video-based 
material, listening comprehension tests will be incorporated into future research. 

APPENDIX 

Five Sample Questions from an Immediate RC Test 

The title: Grapes in the Gas Tank 

Answer the questions based on the information provided in the material. 

1. In general, this article is about ____. (Subject Matter) 

a. the winemaking industry in southwestern France 

b. gasoline resources in Tuscany 

c. a solution to the over-production of wine in France 

d. cutting vines yields profits in France 

2. Why does the writer say that the “wine-tasting atmosphere is like a funeral”? (Clarifying Devices) 

a. A third of wine production will be distilled. 

b. Candles and gloom are the setting of a funeral. 

c. Something must be put to death. 

d. Wine consultant rejects to taste the wine. 

3. The word designated means ____. (Vocabulary) 

a. appointed b. approved c. planned d. entered 

4. Where is Gibelin’s wine sent after it is tasted by a consultant? (Supporting Details) 

a. The winery. b. The market. c. The distillery. d. The vineyard. 

5. The passage suggests that turning wine into gasoline is ____. (Conclusion) 

a. a trend b. an alternative c. a profit d. a disaster 